Temporal difference learning in complex domains

Author

  • Martin C. Smith
Abstract

[Front matter only recovered: Table of Contents (p. iii), List of Tables (p. vi), List of Figures (p. vii)]


Similar resources

Some Explorations in Reinforcement Learning Techniques Applied to the Problem of Learning to Play Pinball

Historically, the accepted approach to control problems in physically complicated domains has been machine learning, because knowledge engineering in such domains can be extremely difficult. When the already physically complicated domain is also continuous and dynamical (possibly with composite and/or sequential goals), the learning task becomes even more difficult due t...


Control of Multivariable Systems Based on Emotional Temporal Difference Learning Controller

One of the most important issues we face in controlling delayed systems and non-minimum-phase systems is fulfilling objective orientations simultaneously and in the best way possible. This paper proposes a new method: an objective orientation is presented for controlling multi-objective systems. The principles of this method are based on emotional temporal difference learning, and has a...


TDγ: Re-evaluating Complex Backups in Temporal Difference Learning

We show that the λ-return target used in the TD(λ) family of algorithms is the maximum likelihood estimator for a specific model of how the variance of an n-step return estimate increases with n. We introduce the γ-return estimator, an alternative target based on a more accurate model of variance, which defines the TDγ family of complex-backup temporal difference learning algorithms. We derive T...
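The λ-return target the abstract refers to is a geometrically weighted average of n-step returns. A minimal episodic sketch of that standard construction (function names and the simple list-based setup here are illustrative, not taken from the paper):

```python
def n_step_return(rewards, values, t, n, gamma):
    """n-step return G_t^(n): n discounted rewards, then bootstrap
    from the current value estimate if the episode has not ended."""
    T = len(rewards)
    steps = min(n, T - t)
    g = sum(gamma**k * rewards[t + k] for k in range(steps))
    if t + n < T:
        g += gamma**n * values[t + n]  # bootstrap term
    return g

def lambda_return(rewards, values, t, gamma, lam):
    """lambda-return: (1-lam) * sum_{n=1}^{T-t-1} lam^(n-1) * G_t^(n)
    plus lam^(T-t-1) times the full Monte Carlo return."""
    T = len(rewards)
    g = 0.0
    for n in range(1, T - t):
        g += (1 - lam) * lam**(n - 1) * n_step_return(rewards, values, t, n, gamma)
    g += lam**(T - t - 1) * n_step_return(rewards, values, t, T - t, gamma)
    return g
```

Setting lam=0 recovers the one-step TD target, and lam=1 recovers the Monte Carlo return, which is the family of "complex backups" the TDγ paper re-weights.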


Speeding up Tabular Reinforcement Learning Using State-Action Similarities

One of the most prominent approaches for speeding up reinforcement learning is injecting human prior knowledge into the learning agent. This paper proposes a novel method to speed up temporal difference learning by using state-action similarities. These hand-coded similarities are tested in three well-studied domains of varying complexity, demonstrating the approach's benefits.
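One natural reading of "speeding up TD learning with state-action similarities" is that each temporal-difference update is also propagated to similar state-action pairs, scaled by a hand-coded similarity weight. The sketch below illustrates that idea with tabular Q-learning; the similarity function and the weighting scheme are assumptions for illustration, not the paper's exact method:

```python
def td_update_with_similarity(Q, s, a, r, s_next, actions, alpha, gamma, sim):
    """One tabular Q-learning step whose TD error is also applied to
    similar state-action pairs, scaled by sim(...) in [0, 1]."""
    # Standard Q-learning target and TD error for the visited pair.
    target = r + gamma * max(Q.get((s_next, b), 0.0) for b in actions)
    delta = target - Q.get((s, a), 0.0)
    # Propagate the update: the visited pair gets weight sim(p, p) == 1,
    # similar pairs get a fraction of the same correction.
    for pair in set(Q) | {(s, a)}:
        w = sim((s, a), pair)
        if w > 0.0:
            Q[pair] = Q.get(pair, 0.0) + alpha * w * delta
```

With an identity similarity (1 for the same pair, 0 otherwise) this reduces to plain Q-learning; a broader similarity lets one observed transition update many entries at once, which is the intended source of speed-up.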


Transfer of Knowledge Structures with Relational Temporal Difference Learning

The ability to transfer knowledge from one domain to another is an important aspect of learning. Knowledge transfer increases learning efficiency by freeing the learner from duplicating past efforts. In this paper, we demonstrate how reinforcement learning agents can use relational representations to transfer knowledge across related domains.



Publication date: 1999